TREC 2004 Genomics Track Experiments at IUB
Abstract
This paper describes the methods we developed for the three tasks of the TREC 2004 Genomics Track: ad hoc retrieval, triage, and annotation. For the ad hoc retrieval task, we used the classic vector space model and studied the use of query expansion and pseudo-relevance feedback. Our submitted runs obtained a mean average precision (MAP) of 0.183. For the triage task, we adopted a naïve Bayes classifier trained on MeSH terms and used gene names as filters to rule out false positives. The resulting normalized utility score was 0.435. For the annotation task, we focused on document representation and applied a variant of the kNN classifier. One of our submitted runs produced an F1 score of 0.561, ranking first out of the 36 runs submitted for the annotation task.
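As a rough illustration of the ad hoc retrieval idea mentioned in the abstract, the sketch below ranks documents with a TF-IDF vector space model and then expands the query through pseudo-relevance feedback. It is not the authors' system: the Rocchio-style update, the weights `alpha` and `beta`, the cutoff `k`, the smoothed IDF formula, and the toy documents are all assumptions chosen for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build TF-IDF vectors (term -> weight) for a list of tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; one common variant, assumed here for illustration.
    idf = {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank(query_vec, doc_vecs):
    """Return document indices sorted by descending cosine score (ties by index)."""
    scores = [(cosine(query_vec, d), i) for i, d in enumerate(doc_vecs)]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]


def expand_query(query_vec, doc_vecs, ranking, k=2, alpha=1.0, beta=0.5):
    """Rocchio-style update that treats the top-k ranked documents as relevant."""
    expanded = {t: alpha * w for t, w in query_vec.items()}
    for i in ranking[:k]:
        for t, w in doc_vecs[i].items():
            expanded[t] = expanded.get(t, 0.0) + beta * w / k
    return expanded


# Toy collection (hypothetical data, not from the TREC corpus).
docs = [
    "brca1 mutation breast cancer risk".split(),
    "brca1 tumor suppressor dna repair".split(),
    "yeast cell cycle regulation".split(),
    "dna repair pathway tumor cells".split(),
]
doc_vecs, idf = tfidf_vectors(docs)

query = "brca1 cancer".split()
query_vec = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query).items()}

first_pass = rank(query_vec, doc_vecs)                  # initial retrieval
expanded_vec = expand_query(query_vec, doc_vecs, first_pass, k=2)
second_pass = rank(expanded_vec, doc_vecs)              # retrieval after feedback
```

In this toy setup the fourth document shares no term with the original query, so it only receives a nonzero score after the feedback step adds terms such as "repair" and "tumor" from the top-ranked documents.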
Similar resources
RMIT University at TREC 2004
RMIT University participated in two tracks at TREC 2004: Terabyte and Genomics, both for the first time. This paper describes the techniques we applied and our experiments in both tracks, and discusses the results of the genomics track runs; the terabyte track results are unavailable at the time of manuscript submission. We also describe our new zettair search engine, in use for the first time ...
Revisiting Again Document Length Hypotheses: TREC 2004 Genomics Track Experiments at Patolis
The TREC-2004 Genomics track evaluation experiments at Patolis Corporation are described with a focus on the document length issues in different retrieval models such as TF*IDF or probabilistic language modeling approaches. In the genomics ad hoc retrieval task, combination of pseudo-relevance feedback and reference database feedback is applied. For the triage sub-task, we trained a SVM classif...
UB at TREC 13: Genomics Track
This paper describes the experiments of the State University of New York at Buffalo in TREC 13. We participated in the Genomics track and submitted official runs to the ad hoc retrieval task. Our approach uses a language model IR system developed in-house. We also present unofficial results for the triage sub-task of the categorization task.
DIMACS at the TREC 2005 Genomics Track
This report describes DIMACS work on the text categorization task of the TREC 2005 Genomics track. Our approach to this task was similar to the triage subtask studied in the TREC 2004 Genomics track. We applied Bayesian logistic regression and achieved good effectiveness on all categories. 1. TEXT CATEGORIZATION TASK The Mouse Genome Informatics (MGI) project of the Jackson Laboratory provides ...
Experience of Using SVM for the Triage Task in TREC 2004 Genomics Track
This paper reports our knowledge-ignorant machine learning approach to the triage task in the TREC 2004 Genomics track, which is actually a text categorization problem. We applied a Support Vector Machine (SVM) and found that information-gain-based feature selection is helpful. Although we achieved decent performance in leave-one-out cross-validation experiments, the evaluation result on the test data...
Publication date: 2004